ISSN: 2074-9007 (Print)
ISSN: 2074-9015 (Online)
DOI: https://doi.org/10.5815/ijitcs
Website: https://www.mecs-press.org/ijitcs
Published By: MECS Press
Frequency: 6 issues per year
Number(s) Available: 136
IJITCS is committed to bridging the theory and practice of information technology and computer science. From innovative ideas to specific algorithms and full system implementations, IJITCS publishes original, peer-reviewed, high-quality articles in the areas of information technology and computer science. IJITCS is a well-indexed scholarly journal and is indispensable reading and reference material for people working at the cutting edge of information technology and computer science applications.
IJITCS has been abstracted or indexed by several world-class databases: Scopus, Google Scholar, Microsoft Academic Search, CrossRef, Baidu Wenku, IndexCopernicus, IET Inspec, EBSCO, VINITI, JournalSeek, ULRICH's Periodicals Directory, WorldCat, Scirus, Academic Journals Database, Stanford University Libraries, Cornell University Library, UniSA Library, CNKI Scholar, J-Gate, ZDB, BASE, OhioLINK, iThenticate, Open Access Articles, Open Science Directory, National Science Library of Chinese Academy of Sciences, The HKU Scholars Hub, etc.
IJITCS Vol. 17, No. 2, Apr. 2025
REGULAR PAPERS
The hyperparameter tuning process is an essential step for ML model optimization, as it is necessary to improve model performance. However, this enhancement involves high computational resources and time costs. Model tuning can significantly raise energy consumption and consequently increase carbon emissions. Therefore, there is an essential need to construct a new framework for this challenge by adding carbon emissions as a vital consideration along with performance. The paper proposes a novel Sustainable Hyperparameter Optimization (SHPO) framework that uses an optimized multi-objective fitness approach. The framework focuses on ensemble classification models (ECMs) namely, Random Forest, ExtraTrees, XGBoost, and AdaBoost. All these models will be optimized using traditional and advanced techniques like Optuna, Hyperopt, and Grid Search. The proposed framework tracks carbon emissions during model hyperparameter tuning. The methodology uses the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) as a method of multi-criteria decision-making (MCDM). This TOPSIS method ranks the hyperparameter sets based on both accuracy and carbon emissions. The objective of the multi-objective fitness approach is to reach the best parameter set with high accuracy and low carbon emissions. It is observed from the experimental results that Optuna based Hyperparameter optimization consistently produced low carbon emissions and achieved high predictive accuracy across the majority of benchmark hyperparameter setups for the models.
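The TOPSIS ranking step described in this abstract can be illustrated with a minimal sketch. The candidate values, the equal weights, and the function name `topsis` are assumptions made for illustration; they are not taken from the paper, and the SHPO framework may normalize or weight the criteria differently.

```python
import math

def topsis(matrix, weights, benefit):
    """Return TOPSIS closeness scores in [0, 1]; higher is better.

    matrix  : one row per alternative, one column per criterion
    weights : importance of each criterion
    benefit : True if larger is better for that criterion, False if smaller is
    Assumes no criterion column is all zeros and the rows are not all identical.
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Ideal best and worst value per criterion.
    cols = list(zip(*weighted))
    best = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    closeness = []
    for row in weighted:
        d_best = math.dist(row, best)    # distance to the ideal solution
        d_worst = math.dist(row, worst)  # distance to the anti-ideal solution
        closeness.append(d_worst / (d_best + d_worst))
    return closeness

# Hypothetical hyperparameter sets: (accuracy, emissions in kg CO2-eq).
candidates = [[0.95, 0.020], [0.94, 0.005], [0.90, 0.004]]
scores = topsis(candidates, weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
print("closeness:", [round(s, 3) for s in scores])
print("ranking (best first):", ranking)
```

Accuracy is treated as a benefit criterion and carbon emissions as a cost criterion, so a set that is slightly less accurate but much cheaper in emissions can outrank the most accurate one.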
The prediction of autism features in relation to age groups has not been definitively addressed, despite the fact that several studies have been conducted using various methodologies. Research in the field of neuroscience has demonstrated that intracranial brain volume and the corpus callosum provide crucial information for the identification of autism spectrum disorder (ASD). Based on these findings, we present Decision Tree-based Autism Prediction System (DT-APS) and Random Forest-based Autism Prediction System (RF-APS) for automatic ASD identification in this paper. These systems utilize characteristics extracted from the corpus callosum and intracranial brain volume, and are based on machine learning techniques. By prioritizing characteristics with the highest discriminatory power for ASD classification, our proposed approaches, DT-APS and RF-APS, have not only enhanced identification accuracy but also simplified the training of machine learning models. The initial step of this method involves dividing each MRI scan into distinct anatomical areas. These areas are adjacent slices in a single 2D image. Each 2D image is mapped to the curvelet space, and the set of GGD parameters characterizes each of the distinct curvelet sub-bands. The AQ-10 dataset was utilized to evaluate the proposed model. When tested on both types of datasets, the suggested prediction model demonstrated superior performance compared to alternative approaches in all relevant metrics, including accuracy, specificity, sensitivity, precision, and false positive rate (FPR).
The study is devoted to assessing the risks of future cyber threats based on expert sampling patterns. One of the key problems of modern cybersecurity is the dynamic nature of threats that change under the influence of technological progress and socio-economic factors. In this context, the authors consider a methodological approach that involves the use of a multi-level analysis of expert opinions. The main emphasis is placed on taking into account the different points of view, experience and professional activities of experts from the public, private and academic sectors. An important stage of the study is the procedure of data cleaning to form a representative sample that takes into account only logically consistent responses of experts. The paper focuses on the integration of the expert sample patterns' features. The key differences in threat assessments between different groups of experts depending on their professional role and experience are identified. This made it possible to formulate comprehensive recommendations for strategic cyber risk management focused on both short-term and long-term priorities. The study makes a significant contribution to understanding the peculiarities of cyber risk assessment through the use of multivariate analysis of expert opinions. The proposed methodology not only improves the quality of forecasts of future cyber threats, but also contributes to the creation of adaptive cybersecurity strategies that take into account the specifics of each sector. The findings of the study emphasize the importance of a multidimensional approach to analyzing cyber threats, taking into account the specifics of each expert group. Integration of assessments and consideration of local peculiarities are key to the development of adaptive and effective cyber defense strategies focused on global and local challenges.
Cerebrovascular disease, commonly known as stroke, is the third leading cause of disability and mortality in the world. In recent years, technological advancements have transformed the way information is acquired and how problems are solved in diverse fields of human endeavor, including the medical and healthcare sectors. Machine Learning (ML) and data-driven techniques have gained prominence in problem solving and have been deployed in the prediction of the occurrence of stroke. This work explores the application of supervised machine learning algorithms for the prediction of stroke, emphasizing the critical need for early prediction to enhance preventive measures. A comprehensive comparison of classification (Support Vector Machine and Random Forest) and regression (Logistic Regression) algorithms was conducted on binary stroke outcome data (stroke versus no stroke) drawn from the International Stroke Trial database. The Synthetic Minority Oversampling Technique (SMOTE) and K-fold cross-validation were used to address the class imbalance in the datasets. The subsequent model comparison demonstrated distinct strengths and weaknesses among the three models. Random Forest (RF) achieved a high accuracy of 89%, while Support Vector Machine (SVM) and Logistic Regression (LR) showed 86% accuracy. LR demonstrated the most balanced predictive performance, achieving high precision for stroke cases and reasonable recall for both classes.
This study presents the RSKD ensemble classifier, developed with ensemble feature selection techniques, to address high-dimensional, low-sample-size cancer datasets. Ensemble classifiers are advantageous in such scenarios, offering better classification accuracy than traditional methods by combining multiple models. This combination enhances predictive performance on high-dimensional datasets. However, stability, a key factor for consistent performance on unseen data, often involves a tradeoff with accuracy. Ensemble methods, due to their generalization capabilities, exhibit higher stability, with feature selection stability measured using a consistency index, averaging 65–70%.
The RSKD classifier integrates ensemble feature selection methods SU-R and ChS-R, which enhance feature selection stability and classification accuracy. Its performance was evaluated on seven high-dimensional, low-sample-size datasets and compared against state-of-the-art classifiers, including Adaboost, GradientBoost, REPTree, asBagging_FSS, SRKNN, MF-GE, and eAdaBoost with DSC. The RSKD ensemble classifier achieved an accuracy improvement of 7.69% to 12.35% over these methods. Among the feature selection approaches, SU-R combined with RSKD outperformed ChS-R, demonstrating superior results in cancer prediction tasks.
The findings of this study underscore the potential of RSKD for achieving generalized, robust performance on challenging datasets. By leveraging ensemble classifiers and ensemble feature selection techniques, researchers can address the inherent difficulties of high-dimensional, low-sample-size datasets, enhancing both accuracy and stability. This work provides a valuable foundation for developing diverse, heterogeneous ensemble approaches for cancer prediction and similar applications.
The incorporation of distributed generation (DG) in radial distribution systems (RDS) has recently garnered much attention. The prime goal of DG integration is to generate power locally and cut down the total power losses (PL) of RDS to increase the overall efficiency. The present work suggests a hybrid optimization approach integrating loss sensitivity factor (LSF) with a whale optimization algorithm (WOA) to optimize different categories of DG. The LSF locates the ideal site, and WOA optimizes the size. The present study optimizes DG units to minimize the total active power losses (APLT) and enhance the bus voltages (BV). The present work investigates the adaptability of the proposed integrated technique on a small 33-bus and a large 118-bus RDS. The APLT of the 33-bus RDS is minimized from 210.98 kW to 101.3 kW, 124.3 kW, 64.56 kW, and 86.5 kW for Type I, Type II, Type III, and Type IV DG placements, respectively. Correspondingly, the minimum bus voltage (BVmin) is increased from 0.9038 p.u. to 0.9511 p.u., 0.9503 p.u., 0.9608 p.u., and 0.9579 p.u. Likewise, significant PL reduction and bus voltage enhancement are obtained in the 118-bus RDS for three units of Type I and Type III DG placements. Further, the adequacy of the hybrid technique is examined for varying power demand on the IEEE 33-bus RDS. The integrated technique effectively narrows the search space of the optimization problem and helps the WOA to find the optimal solution. The simulation outcomes are compared to examine the superiority of the proposed optimization technique.
This paper presents a comprehensive survey of query expansion (QE) techniques in information retrieval (IR). Core techniques, employed data sources, and methodologies used in the process of query expansion are discussed. The study highlights four main steps in expanding queries: preprocessing of data sources and term extraction, calculation of weights and ranking of terms, selection of terms, and finally expansion. The most important findings are that only effective text normalization and removal of stopwords provide a real platform for performing QE. The introduction of contextually relevant terms significantly enhanced relevance feedback and thesaurus-based WordNet expansion techniques. These techniques have been shown to significantly improve retrieval effectiveness in various experiments conducted over the years. The survey also covers manual query expansion techniques and discusses several automated approaches to improving retrieval effectiveness. By reviewing the related literature and methodologies, this work gives an overview of how query expansion techniques have evolved over time and achieved better results in IR systems. The survey offers a valuable resource for researchers and practitioners in information retrieval, shedding light on the advancements, challenges, and future directions in query expansion research.
One area that has seen rapid growth and differing perspectives from many developers in recent years is document management. This idea has advanced beyond some of the steps where developers have made it simple for anyone to access papers in a matter of seconds. It is impossible to overstate the importance of document management systems as a necessity in the workplace environment of an organization. Interviews, scenario creation using participants' and stakeholders' first-hand accounts, and examination of current procedures and structures were all used to collect data. The development approach followed a software development methodology called Object-Oriented Hypermedia Design Methodology. With the help of Unified Modeling Language (UML) tools, a web-based electronic document management system (WBEDMS) was created. Its database was created using MySQL, and the system was constructed using web technologies including XAMPP, HTML, and the PHP programming language. The results of the system evaluation showed a successful outcome. After using the system that was created, respondents' satisfaction with it was 96.60%. This shows that the document system was regarded as adequate and excellent enough to meet the specified requirement when users (secretaries and departmental personnel) used it. Results showed that the system developed yielded an accuracy of 95% and usability of 99.20%. The report came to the conclusion that the suggested electronic document management system would improve user happiness, boost productivity, and guarantee time and data efficiency. It follows that well-known document management systems undoubtedly assist in holding and managing a substantial portion of the knowledge assets of organizations, which include documents and other associated items.
A sizeable number of women face difficulties during pregnancy, which eventually can lead the fetus towards serious health problems. However, early detection of these risks can save the invaluable lives of both infants and mothers. Cardiotocography (CTG) data, which provides sophisticated information by monitoring the heart rate signal of the fetus, is used to predict potential risks to fetal wellbeing and to make clinical conclusions. This paper proposes to analyze the antepartum CTG data (available on the UCI Machine Learning Repository) and develop an efficient tree-based ensemble learning (EL) classifier model to predict fetal health status. In this study, EL considers the Stacking approach, and a concise overview of this approach is discussed and developed accordingly. The study also endeavors to apply distinct machine learning algorithmic techniques on the CTG dataset and determine their performances. The Stacking EL technique, in this paper, involves four tree-based machine learning algorithms, namely, Random Forest classifier, Decision Tree classifier, Extra Trees classifier, and Deep Forest classifier as base learners. The CTG dataset contains 21 features, but only the 10 most important features are selected from the dataset with the Chi-square method for this experiment, and then the features are normalized with Min-Max scaling. Following that, Grid Search is applied for tuning the hyperparameters of the base algorithms. Subsequently, 10-fold cross-validation is performed to select the meta learner of the EL classifier model. A comparative model assessment is then made between the individual base learning algorithms and the EL classifier model, and the findings show the EL classifier’s superiority in fetal health risk prediction, with an accuracy of about 96.05%. Finally, this study concludes that the Stacking EL approach can be a substantial paradigm in machine learning studies to improve models’ accuracy and reduce the error rate.
Artificial Neural Networks are a branch of Artificial Intelligence and have been accepted as a new computing technology in computer science fields. This paper reviews the field of Artificial Intelligence, focusing on recent applications which use Artificial Neural Networks (ANNs) and Artificial Intelligence (AI). It also considers the integration of neural networks with other computing methods such as fuzzy logic to enhance the interpretation ability of data. Artificial Neural Networks are considered a major soft-computing technology and have been extensively studied and applied during the last two decades. The most general applications where neural networks are most widely used for problem solving are in pattern recognition, data analysis, control and clustering. Artificial Neural Networks have abundant features including high processing speeds and the ability to learn the solution to a problem from a set of examples. The main aim of this paper is to explore the recent applications of Neural Networks and Artificial Intelligence; it provides an overview of the field where AI and ANNs are used and discusses the critical role that AI and NNs play in different areas.
One of the main reasons for mortality among people is traffic accidents. Traffic accidents worldwide have increased to become the third expected cause of death in 2020. In Saudi Arabia, there are more than 460,000 car accidents every year. The number of car accidents in Saudi Arabia is rising, especially during busy periods such as Ramadan and the Hajj season. The Saudi Arabian government is making the required efforts to lower the nation’s car accident rate. This paper suggests a business process improvement for car accident reports handled by Najm in accordance with the Saudi Vision 2030. Given the success of drones in many fields (e.g., entertainment, monitoring, and photography), the paper proposes using drones to respond to accident reports, which will help to expedite the process and minimize turnaround time. In addition, the drone provides quick accident response and records scenes with accurate results. The Business Process Management (BPM) methodology is followed in this proposal. The model was validated by comparing before and after simulation results, which show a significant impact on performance of about 40% regarding turnaround time. Therefore, using drones can enhance the process of accident response with Najm in Saudi Arabia.
The Marksheet Generator is a flexible system for generating progress mark sheets of students. This system is mainly based on database technology and the credit based grading system (CBGS). The system is targeted to small enterprises, schools, colleges and universities. It can produce sophisticated ready-to-use mark sheets, which are ready to print once created. The development of a marksheet and gadget sheet focuses on describing tables with columns/rows and sub-columns/sub-rows, rules of data selection and summarizing for a report, particular table or column/row, and formatting the report in the destination document. The adjustable data interface supports popular data sources (SQL Server) and report destinations (PDF file). The marksheet generation system can be used in universities to automate the distribution of digitally verifiable mark sheets of students. The system accesses the students’ exam information from the university database and generates the gadget sheet. The gadget sheet keeps track of student information in a properly listed manner. The project aims at developing a marksheet generation system which can be used in universities to automate the distribution of digitally verifiable student result mark sheets. The system accesses the students’ results information from the institute student database and generates the mark sheets in Portable Document Format, which is tamper proof and thus provides the authenticity of the document. Authenticity of the document can also be verified easily.
The healthcare system is a knowledge-driven industry which consists of vast and growing volumes of narrative information obtained from discharge summaries/reports, physicians’ case notes, and pathologists’ as well as radiologists’ reports. This information is usually stored in unstructured and non-standardized formats in electronic healthcare systems, which makes it difficult for the systems to understand the information contents of the narrative information. Thus, the access to valuable and meaningful healthcare information for decision making is a challenge. Nevertheless, Natural Language Processing (NLP) techniques have been used to structure narrative information in healthcare. Thus, NLP techniques have the capability to capture unstructured healthcare information, analyze its grammatical structure, determine the meaning of the information and translate the information so that it can be easily understood by the electronic healthcare systems. Consequently, NLP techniques reduce cost as well as improve the quality of healthcare. It is therefore against this background that this paper reviews the NLP techniques used in healthcare, their applications as well as their limitations.
The numerical value of k in a k-fold cross-validation training technique of machine learning predictive models is an essential element that impacts the model’s performance. A right choice of k results in better accuracy, while a poorly chosen value for k might affect the model’s performance. In literature, the most commonly used values of k are five (5) or ten (10), as these two values are believed to give test error rate estimates that suffer neither from extremely high bias nor very high variance. However, there is no formal rule. To the best of our knowledge, few experimental studies attempted to investigate the effect of diverse k values in training different machine learning models. This paper empirically analyses the prevalence and effect of distinct k values (3, 5, 7, 10, 15 and 20) on the validation performance of four well-known machine learning algorithms (Gradient Boosting Machine (GBM), Logistic Regression (LR), Decision Tree (DT) and K-Nearest Neighbours (KNN)). It was observed that the value of k and model validation performance differ from one machine-learning algorithm to another for the same classification task. However, our empirical results suggest that k = 7 offers a slight increase in validation accuracy and area under the curve measure with lesser computational complexity than k = 10 across most machine learning algorithms (MLA). We discuss in detail the study outcomes and outline some guidelines for beginners in the machine learning field in selecting the best k value and machine learning algorithm for a given task.
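To make the role of k concrete, the following minimal sketch splits sample indices into k folds and reports how many model fits each k value from the abstract would require. The function name `k_fold_indices` and the sample count of 100 are assumptions for illustration only; the paper's own experimental setup is not reproduced here.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

# Each k means k separate model fits; larger k gives smaller validation folds.
for k in (3, 5, 7, 10, 15, 20):
    folds = list(k_fold_indices(100, k))
    sizes = sorted({len(val) for _, val in folds})
    print(f"k={k:2d}: {len(folds)} model fits, validation fold sizes {sizes}")
```

In practice the indices are usually shuffled (and often stratified by class) before splitting; the contiguous split here keeps the sketch short.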
Universities across the globe have increasingly adopted Enterprise Resource Planning (ERP) systems, software that provides integrated management of processes and transactions in real-time. These systems contain lots of information and hence require secure authentication. Authentication in this case refers to the process of verifying an entity’s or device’s identity, to allow them access to specific resources upon request. However, there have been security and privacy concerns around ERP systems, where only the traditional authentication method of a username and password is commonly used. A password-based authentication approach has weaknesses that can be easily compromised. Cyber-attacks to access these ERP systems have become common to institutions of higher learning and cannot be underestimated as they evolve with emerging technologies. Some universities worldwide have been victims of cyber-attacks which targeted authentication vulnerabilities, resulting in damage to the institutions’ reputations and credibility. Thus, this research aimed at establishing authentication methods used for ERPs in Kenyan universities, their vulnerabilities, and proposing a solution to improve on ERP system authentication. The study aimed at developing and validating a multi-factor authentication prototype to improve ERP systems security. Multi-factor authentication, which combines several authentication factors such as something the user has, knows, or is, is a new state-of-the-art technology that is being adopted to strengthen systems’ authentication security. This research used an exploratory sequential design that involved a survey of chartered Kenyan Universities, where questionnaires were used to collect data that was later analyzed using descriptive and inferential statistics. Stratified, random and purposive sampling techniques were used to establish the sample size and the target group.
The dependent variable for the study was limited to security rating with respect to realization of confidentiality, integrity, availability, and usability, while the independent variables were limited to adequacy of security, authentication mechanisms, infrastructure, information security policies, vulnerabilities, and user training. Correlation and regression analysis established that vulnerabilities, information security policies, and user training have a higher impact on system security. The three variables hence acted as the basis for the proposed multi-factor authentication framework for improving ERP systems security.
Markov models are one of the widely used techniques in machine learning to process natural language. Markov Chains and Hidden Markov Models are stochastic techniques employed for modeling systems that are dynamic and where the future state relies on the current state. The Markov chain, which generates a sequence of words to create a complete sentence, is frequently used in generating natural language. The hidden Markov model is employed in named-entity recognition and the tagging of parts of speech, which tries to predict hidden tags based on observed words. This paper reviews Markov models' use in three applications of natural language processing (NLP): natural language generation, named-entity recognition, and parts of speech tagging. Nowadays, researchers try to reduce dependence on lexicon or annotation tasks in NLP. In this paper, we have focused on Markov Models as a stochastic approach to process NLP. A literature review was conducted to summarize research attempts, focusing on methods/techniques that used Markov Models to process NLP, their advantages, and disadvantages. Most NLP research studies apply supervised models with the improvement of using Markov models to decrease the dependency on annotation tasks. Some others employed unsupervised solutions for reducing dependence on a lexicon or labeled datasets.
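As an illustration of the natural language generation use case, the sketch below builds a first-order Markov chain over words, where the next word depends only on the current one. The toy corpus and the helper names `build_chain` and `generate` are hypothetical and are not drawn from the reviewed studies.

```python
import random
from collections import defaultdict

def build_chain(tokens):
    """Map each word to the list of words observed immediately after it."""
    chain = defaultdict(list)
    for current, nxt in zip(tokens, tokens[1:]):
        chain[current].append(nxt)
    return chain

def generate(chain, start, max_words, rng):
    """Walk the chain from `start`, sampling each next word from its successors."""
    words = [start]
    # Stop at the length limit or when the current word has no known successor.
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)

# Toy corpus, for illustration only.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
chain = build_chain(corpus)
print(generate(chain, "the", 8, random.Random(0)))
```

Because successors are stored with repetition, sampling uniformly from each list reproduces the observed transition frequencies; a hidden Markov model adds an unobserved tag sequence on top of this idea.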
The perfect alignment between three or more sequences of Protein, RNA or DNA is a very difficult task in bioinformatics. There are many techniques for aligning multiple sequences. Many techniques maximize speed and are not concerned with the accuracy of the resulting alignment. Likewise, many techniques maximize accuracy and are not concerned with speed. Reducing memory and execution time requirements and increasing the accuracy of multiple sequence alignment (MSA) on large-scale datasets are the vital goals of any technique. The paper introduces a comparative analysis of the most well-known programs (CLUSTAL-OMEGA, MAFFT, PROBCONS, KALIGN, RETALIGN, and MUSCLE). Benchmark protein datasets are used for testing and evaluating the programs. Execution time and alignment quality are the two important metrics. The obtained results show that no single MSA tool can always achieve the best alignment for all datasets.
[...] Read more.A sizeable number of women face difficulties during pregnancy, which eventually can lead the fetus towards serious health problems. However, early detection of these risks can save both the invaluable life of infants and mothers. Cardiotocography (CTG) data provides sophisticated information by monitoring the heart rate signal of the fetus, is used to predict the potential risks of fetal wellbeing and for making clinical conclusions. This paper proposed to analyze the antepartum CTG data (available on UCI Machine Learning Repository) and develop an efficient tree-based ensemble learning (EL) classifier model to predict fetal health status. In this study, EL considers the Stacking approach, and a concise overview of this approach is discussed and developed accordingly. The study also endeavors to apply distinct machine learning algorithmic techniques on the CTG dataset and determine their performances. The Stacking EL technique, in this paper, involves four tree-based machine learning algorithms, namely, Random Forest classifier, Decision Tree classifier, Extra Trees classifier, and Deep Forest classifier as base learners. The CTG dataset contains 21 features, but only 10 most important features are selected from the dataset with the Chi-square method for this experiment, and then the features are normalized with Min-Max scaling. Following that, Grid Search is applied for tuning the hyperparameters of the base algorithms. Subsequently, 10-folds cross validation is performed to select the meta learner of the EL classifier model. However, a comparative model assessment is made between the individual base learning algorithms and the EL classifier model; and the finding depicts EL classifiers’ superiority in fetal health risks prediction with securing the accuracy of about 96.05%. Eventually, this study concludes that the Stacking EL approach can be a substantial paradigm in machine learning studies to improve models’ accuracy and reduce the error rate.
[...] Read more.One area that has seen rapid growth and differing perspectives from many developers in recent years is document management. This idea has advanced beyond some of the steps where developers have made it simple for anyone to access papers in a matter of seconds. It is impossible to overstate the importance of document management systems as a necessity in the workplace environment of an organization. Interviews, scenario creation using participants' and stakeholders' first-hand accounts, and examination of current procedures and structures were all used to collect data. The development approach followed a software development methodology called Object-Oriented Hypermedia Design Methodology. With the help of Unified Modeling Language (UML) tools, a web-based electronic document management system (WBEDMS) was created. Its database was created using MySQL, and the system was constructed using web technologies including XAMPP, HTML, and PHP Programming language. The results of the system evaluation showed a successful outcome. After using the system that was created, respondents' satisfaction with it was 96.60%. This shows that the document system was regarded as adequate and excellent enough to achieve or meet the specified requirement when users (secretaries and departmental personnel) used it. Result showed that the system developed yielded an accuracy of 95% and usability of 99.20%. The report came to the conclusion that a suggested electronic document management system would improve user happiness, boost productivity, and guarantee time and data efficiency. It follows that well-known document management systems undoubtedly assist in holding and managing a substantial portion of the knowledge assets, which include documents and other associated items, of Organizations.
[...] Read more.One of the main reasons for mortality among people is traffic accidents. The percentage of traffic accidents in the world has increased to become the third in the expected causes of death in 2020. In Saudi Arabia, there are more than 460,000 car accidents every year. The number of car accidents in Saudi Arabia is rising, especially during busy periods such as Ramadan and the Hajj season. The Saudi Arabia’s government is making the required efforts to lower the nations of car accident rate. This paper suggests a business process improvement for car accident reports handled by Najm in accordance with the Saudi Vision 2030. According to drone success in many fields (e.g., entertainment, monitoring, and photography), the paper proposes using drones to respond to accident reports, which will help to expedite the process and minimize turnaround time. In addition, the drone provides quick accident response and recording scenes with accurate results. The Business Process Management (BPM) methodology is followed in this proposal. The model was validated by comparing before and after simulation results which shows a significant impact on performance about 40% regarding turnaround time. Therefore, using drones can enhance the process of accident response with Najm in Saudi Arabia.
Universities across the globe have increasingly adopted Enterprise Resource Planning (ERP) systems, software that provides integrated management of processes and transactions in real time. These systems contain large amounts of information and hence require secure authentication. Authentication in this case refers to the process of verifying an entity's or device's identity in order to allow access to specific resources upon request. However, there have been security and privacy concerns around ERP systems, where commonly only the traditional authentication method of a username and password is used. A password-based authentication approach has weaknesses and can be easily compromised. Cyber-attacks to access these ERP systems have become common at institutions of higher learning and cannot be underestimated as they evolve with emerging technologies. Some universities worldwide have been victims of cyber-attacks that targeted authentication vulnerabilities, resulting in damage to the institutions' reputations and credibility. Thus, this research aimed at establishing the authentication methods used for ERPs in Kenyan universities and their vulnerabilities, and at proposing a solution to improve ERP system authentication. The study aimed at developing and validating a multi-factor authentication prototype to improve ERP system security. Multi-factor authentication, which combines several authentication factors (something the user has, knows, or is), is a state-of-the-art technology that is being adopted to strengthen systems' authentication security. This research used an exploratory sequential design that involved a survey of chartered Kenyan universities, where questionnaires were used to collect data that was later analyzed using descriptive and inferential statistics. Stratified, random, and purposive sampling techniques were used to establish the sample size and the target group.
The dependent variable for the study was limited to security rating with respect to the realization of confidentiality, integrity, availability, and usability, while the independent variables were limited to adequacy of security, authentication mechanisms, infrastructure, information security policies, vulnerabilities, and user training. Correlation and regression analysis established that vulnerabilities, information security policies, and user training have a higher impact on system security. These three variables hence acted as the basis for the proposed multi-factor authentication framework for improving ERP system security.
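As an illustration of combining "something the user knows" with "something the user has", the following minimal sketch pairs a password check with a time-based one-time password (TOTP, RFC 6238). The abstract does not say which second factor the prototype uses, so TOTP is an assumption here, and the function names `totp` and `verify_login` are hypothetical.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at_time=None, step=30, digits=6):
    """Compute an RFC 6238 one-time password using HMAC-SHA1."""
    if at_time is None:
        at_time = time.time()
    # The moving factor is the number of elapsed time steps since the epoch.
    counter = int(at_time) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks the offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_login(password_ok, secret, submitted_code, at_time=None):
    """Grant access only if BOTH factors pass (illustrative policy, not the paper's)."""
    expected = totp(secret, at_time)
    # compare_digest avoids leaking information through comparison timing.
    return password_ok and hmac.compare_digest(expected, submitted_code)
```

A real deployment would also tolerate clock drift by accepting adjacent time steps and would rate-limit attempts; both are omitted here for brevity.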
Artificial Neural Networks are a branch of Artificial Intelligence and have been accepted as a new computing technology in computer science. This paper reviews the field of Artificial Intelligence, focusing on recent applications that use Artificial Neural Networks (ANNs) and Artificial Intelligence (AI). It also considers the integration of neural networks with other computing methods, such as fuzzy logic, to enhance the ability to interpret data. Artificial Neural Networks are considered a major soft-computing technology and have been extensively studied and applied during the last two decades. The most common applications of neural networks for problem solving are in pattern recognition, data analysis, control, and clustering. Artificial Neural Networks have many useful features, including high processing speed and the ability to learn the solution to a problem from a set of examples. The main aim of this paper is to explore recent applications of Neural Networks and Artificial Intelligence, to provide an overview of the fields where AI and ANNs are used, and to discuss the critical role that AI and NNs play in different areas.
The numerical value of k in the k-fold cross-validation technique for training machine learning predictive models is an essential element that impacts the model's performance. A good choice of k results in better accuracy, while a poorly chosen value of k might degrade the model's performance. In the literature, the most commonly used values of k are five (5) and ten (10), as these two values are believed to give test error rate estimates that suffer neither from extremely high bias nor from very high variance. However, there is no formal rule. To the best of our knowledge, few experimental studies have investigated the effect of diverse k values in training different machine learning models. This paper empirically analyses the prevalence and effect of distinct k values (3, 5, 7, 10, 15 and 20) on the validation performance of four well-known machine learning algorithms (Gradient Boosting Machine (GBM), Logistic Regression (LR), Decision Tree (DT) and K-Nearest Neighbours (KNN)). It was observed that the best value of k and the model validation performance differ from one machine learning algorithm to another for the same classification task. However, our empirical results suggest that k = 7 offers a slight increase in validation accuracy and area under the curve with lower computational complexity than k = 10 across most of the algorithms. We discuss the study outcomes in detail and outline some guidelines for beginners in the machine learning field on selecting the best k value and machine learning algorithm for a given task.
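To make the role of k concrete, the sketch below shows how k-fold cross-validation partitions a dataset and how the number of models trained and the validation fold size change across the k values studied. It is an illustration only, not the authors' experimental code; the helper names `k_fold_indices` and `fold_summary` are hypothetical, and the split is unshuffled for clarity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for plain, unshuffled k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def fold_summary(n_samples, k_values=(3, 5, 7, 10, 15, 20)):
    """Return {k: (models_trained, smallest_validation_fold_size)}."""
    summary = {}
    for k in k_values:
        folds = list(k_fold_indices(n_samples, k))
        summary[k] = (len(folds), min(len(val) for _, val in folds))
    return summary
```

Each sample appears in exactly one validation fold, and a larger k means more models to train, each validated on a smaller fold, which is the cost trade-off the abstract refers to.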
The perfect alignment of three or more sequences of protein, RNA, or DNA is a very difficult task in bioinformatics. There are many techniques for aligning multiple sequences. Many techniques maximize speed and are not concerned with the accuracy of the resulting alignment. Likewise, many techniques maximize accuracy and are not concerned with speed. Reducing memory and execution time requirements and increasing the accuracy of multiple sequence alignment (MSA) on large-scale datasets are vital goals of any technique. The paper presents a comparative analysis of the most well-known programs (CLUSTAL-OMEGA, MAFFT, PROBCONS, KALIGN, RETALIGN, and MUSCLE). Benchmark protein datasets are used to test and evaluate the programs. Execution time and alignment quality are both important metrics. The obtained results show that no single MSA tool can always achieve the best alignment for all datasets.
Web applications are becoming very important in our lives, as many sensitive processes depend on them. Therefore, their safety and resilience against malicious attacks are critical. Most studies focus on ways to detect these attacks individually. In this study, we develop a new system to detect and prevent vulnerabilities in web applications. It has multiple functions to deal with some recurring vulnerabilities. The proposed system provides detection and prevention for four types of vulnerabilities: SQL injection, cross-site scripting attacks, remote code execution, and fingerprinting of backend technologies. For each type of vulnerability, we investigated how it works, then the process of detecting it, and finally provided prevention for it. This achieved three goals: reduced testing costs, increased efficiency, and improved safety. The proposed system has been validated through a practical application on a website, and experimental results demonstrate its effectiveness in detecting and preventing security threats. Our study contributes to the field of security by presenting an innovative approach to addressing security concerns, and our results highlight the importance of implementing advanced detection and prevention methods to protect against potential cyberattacks. The significance and research value of this study lie in its potential to enhance the security of online systems and reduce the risk of data breaches.
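Two of the vulnerability classes named above have well-known standard mitigations, sketched here with the Python standard library: parameterized queries against SQL injection and output escaping against cross-site scripting. This is a generic illustration, not the system described in the paper; the table layout and the function names `find_user`, `render_comment`, and `make_demo_db` are hypothetical.

```python
import html
import sqlite3


def make_demo_db():
    """Create a small in-memory database for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany("INSERT INTO users (username) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user(conn, username):
    """Look up a user with a parameterized query.

    The `?` placeholder makes the driver treat `username` purely as data,
    so quote characters in the input cannot change the SQL statement.
    """
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


def render_comment(comment):
    """Escape user-supplied text before embedding it in HTML."""
    return "<p>" + html.escape(comment) + "</p>"
```

With this approach, a classic injection payload such as `' OR '1'='1` is matched literally against the `username` column and returns no rows instead of every row.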
Process Mining (PM) and the capabilities of PM tools play a significant role in helping organizations benefit from their processes and event data, especially in this digital era. The success of PM initiatives in producing the effective and efficient outputs and outcomes that organizations desire depends largely on the capabilities of the PM tools. This makes selecting tools for a specific context critical, and comparing the tools can lead organizations to an effective selection. To meet this need and to give insight to both practitioners and researchers, we systematically reviewed the literature and identified the papers that compare PM tools, yielding comprehensive results through a comparison of available PM tools. The review specifically delivers the tools' comparison frequency, the methods and criteria used to compare them, the strengths and weaknesses of the compared tools for the selection of appropriate PM tools, and findings related to the identified papers' trends and demographics. Although some articles compare PM tools, there is a lack of literature reviews on the studies that compare PM tools in the market. As far as we know, this paper presents the first review of its kind in the literature.
The usefulness of a collaborative filtering recommender system is affected by its ability to capture changes in users' preferences for the recommended items during the recommendation process. Capturing these changes makes it easier for the system to satisfy users' interests over time and provide good-quality recommendations. The existing system studied fails to solicit user input on the recommended items and is also unable to incorporate users' preference changes over time, which leads to poor-quality recommendations. In this work, an Enhanced Movie Recommender system is presented to improve the quality of recommendations. The system solicits users' input to create user profiles. It then incorporates a set of new features (such as age and genre) to predict users' preference changes over time. This enables it to recommend movies based on users' new preferences. The experimental study conducted on the Netflix and MovieLens datasets demonstrated that, compared to the existing work, the proposed work improved the recommendation results, based on the values of Precision and RMSE obtained in this study.
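For readers unfamiliar with the two evaluation measures named above, the following sketch computes RMSE over predicted ratings and precision over a top-k recommendation list. The abstract does not state which precision variant was used, so precision@k is an assumption, and the function names are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that the user found relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Dividing by k (not len(top_k)) penalizes lists shorter than k.
    return hits / k
```

Lower RMSE means predicted ratings are closer to the true ratings, while higher precision means more of the recommended movies are ones the user actually likes.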